Activity Analysis in Unconstrained Surveillance Videos

Authors

  • Mahmudul Hasan
  • Yingying Zhu
  • Santhoshkumar Sunderrajan
  • Niloufar Pourian
  • Amit Roy-Chowdhury
Abstract

We detect the seven activities defined by the TRECVID SED task: CellToEar, Embrace, ObjectPut, PeopleMeet, PeopleSplitUp, PersonRuns, and Pointing. We employ two different strategies to detect these activities, chosen according to their characteristics. Activities such as CellToEar, Embrace, ObjectPut, and Pointing result from the articulated motion of human body parts. For these activities we therefore employ a bag-of-words strategy based on local spatio-temporal interest point (STIP) features. Visual vocabularies are constructed from the STIP features, and each activity is described by a histogram of visual words. We also construct an activity probability map for each camera-activity pair, which reflects the spatial distribution of an activity within a camera's view. We train a discriminative SVM classifier with a Gaussian kernel for each camera-activity pair. During evaluation we employ a sliding-window technique: we slide spatio-temporal cuboids along both the spatial and temporal directions to find likely activities. Each cuboid is likewise described by a histogram of visual words, and the final decision is made using the SVM classifier and the activity probability map. For activities such as PeopleMeet, PeopleSplitUp, and PersonRuns, the trajectories of the persons involved are discriminative. For instance, the trajectories in PeopleMeet converge over time, while those in PeopleSplitUp diverge. We therefore use a track-based string of feature graphs (SFG) to recognize these activities. The results of our experimental runs on the evaluation videos are comparable with those of other participants, and our performance on every activity ranks among the top five teams.
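The two cues described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `bow_histogram` and `distance_trend` are hypothetical, the descriptors and vocabulary are toy 2-D points rather than real STIP features, and the distance-slope test is only a simple stand-in for the converge/diverge intuition, not the SFG model itself.

```python
import math


def bow_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word (Euclidean distance)
    and return the L1-normalised histogram of word counts.
    Assumption: `vocabulary` is a list of cluster centres built beforehand."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda k: math.dist(d, vocabulary[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


def distance_trend(track_a, track_b):
    """Least-squares slope of the distance between two tracks over frames.
    Negative slope: the tracks converge (PeopleMeet-like).
    Positive slope: the tracks diverge (PeopleSplitUp-like).
    Assumption: both tracks are frame-aligned lists of (x, y) positions."""
    dists = [math.dist(p, q) for p, q in zip(track_a, track_b)]
    n = len(dists)
    if n < 2:
        return 0.0
    t_mean = (n - 1) / 2
    d_mean = sum(dists) / n
    num = sum((t - t_mean) * (d - d_mean) for t, d in enumerate(dists))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


if __name__ == "__main__":
    vocab = [(0.0, 0.0), (10.0, 10.0)]                # two toy visual words
    descs = [(1, 1), (9, 9), (0, 1), (11, 10)]        # toy descriptors
    print(bow_histogram(descs, vocab))

    walker_a = [(0, 0), (1, 0), (2, 0)]
    walker_b = [(10, 0), (9, 0), (8, 0)]
    print(distance_trend(walker_a, walker_b))         # negative: converging
```

In a pipeline like the one described, a histogram of this kind would be the input to the per-camera, per-activity classifier, while a trajectory cue of this kind only hints at why track-based features separate PeopleMeet from PeopleSplitUp.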


Related resources

Face Detection and Verification in Unconstrained Videos: Challenges, Detection, and Benchmark Evaluation

With increasing security concerns, surveillance cameras are playing an important role in society, and face recognition in crowds is gaining more importance than ever. For video face recognition, researchers have primarily focused on controlled environments with a single person in a frame. However, in real-world surveillance situations, the environment is unconstrained and the videos are likel...


A Videography Analysis Framework for Video Retrieval and Summarization

Overview: In this work, we focus on developing features and approaches to represent and analyze videography styles in unconstrained videos. By unconstrained videos, we mean typical consumer videos with significant content complexity and diverse editing artifacts, mostly with long duration. We present an approach for unsupervised videography analysis for unconstrained videos. Intuitively, each v...


Ranking Domain-Specific Highlights by Analyzing Edited Videos

We present a fully automatic system for ranking domain-specific highlights in unconstrained personal videos by analyzing online edited videos. A novel latent linear ranking model is proposed to handle noisy training data harvested online. Specifically, given a search query (domain) such as “surfing”, our system mines the YouTube database to find pairs of raw and corresponding edited videos. Leve...


Video Précis: Highlighting Diverse Aspects of Videos

Summarizing long unconstrained videos is gaining importance in surveillance, web-based video browsing, and video-archival applications. Summarizing a video requires one to identify key aspects that contain the essence of the video. In this paper, we propose an approach that optimizes two criteria that a video summary should embody. The first criterion, ‘coverage’, requires that the summary be a...


Complex event recognition using constrained low-rank representation

Complex event recognition is the problem of recognizing events in long and unconstrained videos. In this extremely challenging task, concepts have recently shown a promising direction where core low-level events (referred to as concepts) are annotated and modeled using a portion of the training data, then each complex event is described using concept scores, which are feat...



Publication date: 2013